Experiments with PageRank Computation
Abstract
The PageRank algorithm is one of the most commonly used algorithms for determining the global importance of web pages. Because the web graph contains billions of nodes, computing a PageRank vector is very computationally intensive and may take anywhere from hours to months, depending on the efficiency of the algorithm. This has prompted many researchers to propose techniques to enhance the PageRank algorithm. Researchers have investigated many aspects of the algorithm, including stability, convergence speed, memory consumption, I/O efficiency, and the properties of the connectivity matrix [2-7, 11]. However, some aspects of the PageRank algorithm remain unstudied, and very few techniques build on the results of others. In this work we investigate some PageRank properties and report our findings.
Similar resources
PageRank Computation and the Structure of the Web: Experiments and Algorithms
We describe some computational experiments carried out with variants of the “PageRank” model due to Page et al., with particular reference to the small and large scale structure of the Web.
Traps and Pitfalls of Topic-Biased PageRank
We discuss a number of issues in the definition, computation and comparison of PageRank values that have been addressed sparsely in the literature, often with contradictory approaches. We study the difference between weakly and strongly preferential PageRank, which patch the dangling nodes with different distributions, extending analytical formulae known for the strongly preferential case, and ...
Web-Site-Based Partitioning Techniques for Reducing the Preprocessing Overhead before the Parallel PageRank Computations
The efficiency of the PageRank computation is important since the constantly evolving nature of the Web requires this computation to be repeated many times. Due to the enormous size of the Web’s hyperlink structure, PageRank computations are usually carried out on parallel computers. Recently, a hypergraph-partitioning-based formulation for parallel sparse-matrix vector multiplication is propos...
Do Your Worst to Make the Best: Paradoxical Effects in PageRank
Deciding which kind of visit accumulates high-quality pages more quickly is one of the most often debated issues in the design of web crawlers. It is known that breadth-first visits work well, as they tend to discover pages with high PageRank early on in the crawl. Indeed, this visit order is much better than depth first, which is in turn even worse than a random visit; nevertheless, breadth-fir...
Paradoxical Effects in PageRank Incremental Computations
Deciding which kind of visiting strategy accumulates high-quality pages more quickly is one of the most often debated issues in the design of web crawlers. This paper proposes a related, and previously overlooked, measure of effectivity for crawl strategies: whether the graph obtained after a partial visit is in some sense representative of the underlying web graph as far as the computation of ...